The Relation of Closed Itemset Mining, Complete Pruning Strategies and Item Ordering in Apriori-Based FIM Algorithms
Abstract
In this paper we investigate the relationship between closed itemset mining, the complete pruning technique and item ordering in the Apriori algorithm. We claim that when a proper item order is used, complete pruning does not necessarily speed up Apriori, and in databases with certain characteristics, pruning increases run time significantly. We also show that if complete pruning is applied, then an intersection-based technique not only results in a faster algorithm, but also yields closed-itemset selection for free in terms of both memory consumption and run time. The theoretical claims are supported by results from a comprehensive set of experiments, involving hundreds of tests on numerous databases with different support thresholds.
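For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of a textbook level-wise Apriori with an optional complete pruning step, plus a naive closed-itemset filter. It is not the authors' implementation and does not reproduce their intersection-based technique or their item-ordering strategy. The names `apriori`, `closed_itemsets`, `min_support` and `prune` are hypothetical, support is counted as an absolute number of transactions, and items are assumed to be sortable (for example, strings).

```python
from itertools import combinations


def apriori(transactions, min_support, prune=True):
    """Return {frozenset: support} for all frequent itemsets (illustrative only)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(level)

    k = 1
    while level:
        # Join step: merge two frequent k-itemsets sharing their first k-1 items.
        sorted_sets = sorted(tuple(sorted(s)) for s in level)
        candidates = set()
        for a, b in combinations(sorted_sets, 2):
            if a[:-1] != b[:-1]:
                continue
            cand = frozenset(a) | frozenset(b)
            # Complete pruning: drop the candidate unless every
            # k-subset of it is already known to be frequent.
            if prune and any(frozenset(sub) not in level
                             for sub in combinations(cand, k)):
                continue
            candidates.add(cand)

        # Support counting by scanning the transactions.
        level = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        k += 1
    return frequent


def closed_itemsets(frequent):
    """Keep itemsets that have no proper frequent superset with equal support."""
    return {s: c for s, c in frequent.items()
            if not any(s < t and c == frequent[t] for t in frequent)}
```

Pruning only removes candidates that cannot be frequent, so `apriori(data, s, prune=True)` and `apriori(data, s, prune=False)` return the same itemsets; the two settings differ only in how much work is spent on subset checks versus support counting, which is the trade-off the abstract discusses.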
Related resources
Efficient Mining of Association Rules Using Closed
Discovering association rules is one of the most important tasks in data mining. Many efficient algorithms have been proposed in the literature. The most notable are Apriori, Mannila's algorithm, Partition, Sampling and DIC, which are all based on the Apriori mining method: pruning the subset lattice (itemset lattice). In this paper we propose an efficient algorithm, called Close, based on a new...
Accelerating Parallel Frequent Itemset Mining on Graphics Processors with Sorting
Frequent Itemset Mining (FIM) is one of the most investigated fields of data mining. Its goal is to find the most frequently occurring subsets of items among the transactions in a database. Many methods have been proposed to solve this problem, and the Apriori algorithm is one of the best-known methods for FIM in a transactional database. In ...
WFIM: Weighted Frequent Itemset Mining with a weight range and a minimum weight
Researchers have proposed weighted frequent itemset mining algorithms that reflect the importance of items. The main focus of weighted frequent itemset mining concerns satisfying the downward closure property. All weighted association rule mining algorithms suggested so far have been based on the Apriori algorithm. However, pattern growth algorithms are more efficient than Apriori based algorit...
A Probability Analysis for Frequent Itemset Mining Algorithms
Since the introduction of the Frequent Itemset Mining (FIM) problem, several different algorithms for solving it have been proposed and experimentally analyzed. Our work focusses on the theoretical analysis of FIM. The aim is to give a detailed probabilistic study of the performance of FIM algorithms for different data distributions. It is joint work with Dirk Van Gucht and Paul Purdom from Indiana ...
A New Algorithm for High Average-utility Itemset Mining
High utility itemset mining (HUIM) is an emerging field in data mining that has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items...
Publication date: 2005